HCP with PSMA: A Robust Spoken Language Parser
Abstract
"Spoken language" is a field of natural language processing that deals with transcribed speech utterances. Processing spoken language is considerably more complex than processing standard, grammatically correct natural language, and requires special treatment of typical speech phenomena called "disfluencies", such as corrections, interjections, and repetitions of words or phrases. We present a parsing technique that uses a Hybrid Connectionist Parser (HCP) extended with a Phrase Structure Matching Algorithm (PSMA) for the syntactic processing (parsing) of spoken language. The HCP is a hybrid of a connectionist and a symbolic approach, which builds a neural network dynamically based on a given context-free grammar and an input sentence. Its advantage over traditional parsers comes from the use of graded activation and activation passing in the network. These features were exploited in techniques to detect and repair disfluencies, in combination with the Phrase Structure Matching Algorithm, which operates on the graphical structure of the network. This approach to spoken language parsing is purely syntactic and, in contrast to other work, requires neither a modified grammar for spoken language, nor information derived from the speech input, nor any pre-marking of disfluencies. We implemented the combined HCP and PSMA parser and tested it on a standard corpus for spoken language, the HCRC Map Task Corpus. The experiments showed very good results, especially for a purely syntactic method that detects as well as corrects disfluencies.
Similar resources
A robust parser for spoken language understanding
This paper describes a robust parsing algorithm for spoken language understanding. Comparing with the other work in robust parsing, we focus on building a parser that is robust to not only ill-formed spontaneous spoken language inputs but also under-specified grammars. Preliminary experiment results show that the parsing performance deteriorates more gracefully than another parser we have used ...
Studies on Robust Language and Dialogue Processing for Spoken Dialogue Systems
In spoken dialogue systems, robust language processing for spontaneous speech understanding and robust dialogue processing for achieving user goal are inevitable. Previously, research of speech recognition and research of natural language understanding were done independently. At first glance, it seems to be no problem to combine these two technologies, because the purpose of speech recognition...
Parsing spoken language without syntax
Parsing spontaneous speech is a difficult task because of the ungrammatical nature of most spoken utterances. To overpass this problem, we propose in this paper to handle the spoken language without considering syntax. We describe thus a microsemantic parser which is uniquely based on an associative network of semantic priming. Experimental results on spontaneous speech show that this parser st...
Parsing Spoken Language without Syntax: a Microsemantic Approach
Parsing spontaneous speech is a difficult task because of the ungrammatical nature of most spoken utterances. To overpass this problem, we propose in this paper to handle the spoken language without considering syntax. We describe thus a microsemantic parser which is uniquely based on an associative network of semantic priming. Experimental results on spontaneous speech show that this parser st...
Robust language understanding in mipad
MiPad is an application prototype for the study of conversational, multi-modal interface in Microsoft Research. It has a Tap and Talk interface that allows users to effectively interact with a PDA device. The major Spoken Language Understanding (SLU) engine component behind MiPad is a robust chart parser. This paper discusses some novel features of the parser that enable it to take full advanta...